vfs: integrate with CJS and ESM module loaders #63653

Open
mcollina wants to merge 3 commits into nodejs:main from mcollina:vfs-module-loader-integration

Conversation

@mcollina
Member

@mcollina mcollina commented May 30, 2026

Integrate the node:vfs virtual file system with both module loaders so files served from a mounted VFS can be resolved and loaded via require() and import, with first-class support for package.json, conditional exports, extensionless files, etc.

What lands in this PR

Toggleable loader hooks (lib/internal/modules/helpers.js). New
wrappers — loaderStat, loaderReadFile, toRealPath,
loaderLegacyMainResolve, loaderGetFormatOfExtensionlessFile,
loaderGetLayerForPath, and the four loader*PackageJSON variants —
that fall through to the existing C++ binding / fs calls when no VFS
is mounted (zero overhead on the fast path) and dispatch to VFS when
overrides are installed. Setters (setLoaderFsOverrides,
setLoaderPackageOverrides) install/clear the overrides atomically.

Consumers updated: lib/internal/modules/cjs/loader.js,
lib/internal/modules/esm/{resolve,load,get_format}.js, and
lib/internal/modules/package_json_reader.js all now go through the
wrappers. The "DO NOT depend on patchability" warnings in esm/load.js
and esm/resolve.js are preserved and now point at node:vfs and
module.registerHooks() as the formal hook mechanisms.

VFS layer ID. Each VirtualFileSystem instance is assigned a
per-process monotonically increasing layerId at construction
(vfs.layerId). The id is stable across mount/unmount cycles and
shows up in:

  • the NODE_DEBUG=vfs register/deregister events;
  • the ERR_INVALID_STATE message thrown on overlapping mounts;
  • the ?vfs-layer=N URL tag described below.

Scoped cache purging on unmount(). Replaces the earlier global
flush. CommonJS caches (require.cache, the CJS stat cache, the
helpers.js realpath cache, the package_json_reader caches) are
filtered with vfs.shouldHandle(filename). ESM URLs are tagged at
resolve time with ?vfs-layer=N (matching the cache-busting pattern
used by HMR tooling) and the cascaded loader's loadCache is purged
by that tag. Multi-mount setups no longer pay a cross-VFS cache-warmup
penalty on a single unmount, and ESM modules loaded from a VFS become
reachable for purge instead of leaking forever.

User-visible change: import.meta.url for VFS-loaded ES modules ends
in ?vfs-layer=N. CommonJS __filename / module.filename are
unchanged.

Docs. doc/api/vfs.md gains a Mounting section
(mount()/unmount()/mounted/mountPoint/layerId) and a Module
loader integration section covering the resolution behavior, the URL
tag, and the scoped cache purge.

Tests under test/parallel/ (all gated by --experimental-vfs):

  • test-vfs-require.js, test-vfs-import.mjs,
    test-vfs-module-hooks.mjs — CJS / ESM / conditional-exports
    resolution against a VFS.
  • test-vfs-package-json.js, test-vfs-package-json-cache.js,
    test-vfs-invalid-package-json.js — package.json parsing,
    validation parity with the native binding, cache invalidation.
  • test-vfs-module-hooks-cleanup.js — register/deregister cycles,
    cleanup, missing-main ESM error shape.
  • test-vfs-layer-id.js - layerId monotonicity and stability
    across mount/unmount; presence in overlap errors.
  • test-vfs-scoped-cache-purge.js — multi-mount isolation;
    import.meta.url URL tag visibility.

Out of scope (separate follow-ups)

  • SEA VFS integration (the useVfs config flag, asset-as-main).
  • Migration of the C++ package_configs_ cache to the same scoped
    purge approach.
  • Compile-cache / source-map-cache scoping.
  • Permission-model integration (--allow-fs-vfs boundary checks) is
    intentionally not extended beyond the existing gate.

@nodejs-github-bot
Collaborator

Review requested:

  • @nodejs/loaders

@nodejs-github-bot added the lib / src and needs-ci labels May 30, 2026
@codecov

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 90.71207% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.32%. Comparing base (39b481b) to head (51b033a).
⚠️ Report is 26 commits behind head on main.

Files with missing lines      Patch %   Lines
lib/internal/vfs/setup.js     85.54%    59 Missing and 1 partial ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63653      +/-   ##
==========================================
- Coverage   91.96%   90.32%   -1.64%     
==========================================
  Files         379      732     +353     
  Lines      166638   237271   +70633     
  Branches    25497    44727   +19230     
==========================================
+ Hits       153242   214325   +61083     
- Misses      13099    14656    +1557     
- Partials      297     8290    +7993     
Files with missing lines                       Coverage Δ
lib/internal/modules/cjs/loader.js             98.15% <100.00%> (+18.48%) ⬆️
lib/internal/modules/esm/get_format.js         94.83% <100.00%> (+21.03%) ⬆️
lib/internal/modules/esm/load.js               91.47% <100.00%> (+8.22%) ⬆️
lib/internal/modules/esm/resolve.js            99.04% <100.00%> (+11.14%) ⬆️
lib/internal/modules/helpers.js                98.75% <100.00%> (+7.70%) ⬆️
lib/internal/modules/package_json_reader.js    99.47% <100.00%> (+11.86%) ⬆️
lib/internal/vfs/setup.js                      86.14% <85.54%> (-0.55%) ⬇️

... and 472 files with indirect coverage changes


@mcollina
Member Author

@joyeecheung take a look, should be easier to review.

Comment thread lib/internal/modules/esm/load.js
Comment thread lib/internal/modules/esm/resolve.js
mcollina added a commit to mcollina/node that referenced this pull request Jun 1, 2026
Restore the "DO NOT depend on the patchability" warnings in esm/load.js
and esm/resolve.js that were dropped along with the fs imports. The
warning still applies; it now also points at node:vfs as one of the
formal hook mechanisms callers should reach for instead.

Addresses review feedback from @jsumners-nr in
nodejs#63653
@mcollina force-pushed the vfs-module-loader-integration branch from 6321e08 to 51b033a June 1, 2026 15:54
@mcollina added the request-ci label Jun 3, 2026
@github-actions removed the request-ci label Jun 3, 2026
Member

@joyeecheung joyeecheung left a comment

A design question recently occurred to me: have we explored the versioning of the mounting?

Comment thread lib/internal/vfs/setup.js Outdated
const { clearPackageJSONCache } = require('internal/modules/package_json_reader');
clearPackageJSONCache();
// The ESM cascaded loader's loadCache is intentionally NOT cleared here:
// clearing it mid-flight (while another import() is awaiting nested
Member

Isn't it also the case for require() anyway? (You can unmount it at the top level of a CJS module while the graph is loading and things can be broken there, it's always up to the user to ensure they don't unmount from an unsafe place).

An alternative approach would be to add a cache-busting search param to avoid the stale caches, which solves the identity problem. This is also what the test mock module does and how some user-land HMR works, see #61767 (comment)

Member Author

I wanted to avoid search params because in my experience they are unreliable (in a graph) and I wanted to minimize memory leaks. Not sure if it's totally possible.

Member

@joyeecheung joyeecheung Jun 3, 2026

Doesn't this now leak all the time whenever new ES modules are loaded through the VFS? Unless the identifiers are reused by subsequent mounting (i.e. graphs 1, 2, 3 overlap), the modules stay stale in the cache. With a search param in the identifier, I think it will be possible to walk through the caches and purge everything identified by an unmounted VFS version in the search params during unmounting. Not that it eliminates the leaks (#63186 will continue to be an issue), but at least it reduces the leaks when the graphs don't overlap?

@mcollina
Member Author

mcollina commented Jun 3, 2026

> A design question recently occurred to me: have we explored the versioning of the mounting?

what do you mean? You mean multiple vfs layers on top of each other?

@joyeecheung
Member

joyeecheung commented Jun 3, 2026

> what do you mean? You mean multiple vfs layers on top of each other?

For the stacks to have some kind of version number/ID to identify the current status?

BTW I just noticed that there's no mention of unmount() and mount() in the VFS docs..

@mcollina
Member Author

mcollina commented Jun 3, 2026

> BTW I just noticed that there's no mention of unmount() and mount() in the VFS docs..

I did purge them when doing the splitting; I forgot to bring them back. I'll add them to this PR.

@mcollina
Member Author

mcollina commented Jun 3, 2026

> For the stacks to have some kind of version number/ID to identify the current status?

No but we totally should.

mcollina added 2 commits June 3, 2026 22:15
Route loader fs and package.json operations through toggleable
wrappers so the VFS can resolve and load modules from mounted paths.
When no VFS is mounted, the wrappers take a null-check fast path with
zero overhead.

Hooks:
- loaderStat / loaderReadFile / toRealPath / loaderLegacyMainResolve /
  loaderGetFormatOfExtensionlessFile in
  lib/internal/modules/helpers.js, consumed by cjs/loader.js,
  esm/resolve.js, esm/load.js and esm/get_format.js.
- loaderReadPackageJSON / loaderGetNearestParentPackageJSON /
  loaderGetPackageScopeConfig / loaderGetPackageType, consumed by
  package_json_reader.js.
- setLoaderFsOverrides / setLoaderPackageOverrides install / clear
  all hooks; clearRealpathCache exposes the helpers.js realpath
  cache so deregister can flush it.

lib/internal/vfs/setup.js installs the overrides on first
registerVFS and clears every JS-side loader cache (CJS _pathCache,
CJS stat cache, realpath cache, package.json cache) on every
deregister. The overrides themselves are uninstalled when the last
VFS is removed so the fast path is fully restored.
legacyMainResolve / extensionless-format behavior matches the C++
binding; package.json validation matches src/node_modules.cc
(silently omit non-string main, throw on non-string name/type, etc).

The "DO NOT depend on patchability" warnings in esm/load.js and
esm/resolve.js are preserved and now point at node:vfs and
module.registerHooks() as the formal hook mechanisms.

Tests cover require / import / module-hooks / package.json / cache
invalidation / cleanup-cycle scenarios under --experimental-vfs.

Signed-off-by: Matteo Collina <hello@matteocollina.com>

Each VirtualFileSystem now exposes a per-process monotonically
increasing `layerId`, assigned at construction. The id is stable
across mount/unmount cycles for the lifetime of the instance and
surfaces in:

- debug() output for register / deregister so the layer stack is
  visible when NODE_DEBUG=vfs is enabled;
- the overlap ERR_INVALID_STATE message, which now names the layer
  ids of the conflicting mounts.

The id is the building block for tagging cache entries with the
owning VFS, which a follow-up will use to replace the global
loader-cache flush in deregisterVFS with a scoped purge.

Refs: nodejs#63653
Signed-off-by: Matteo Collina <hello@matteocollina.com>

@mcollina force-pushed the vfs-module-loader-integration branch from 51b033a to 294a19c June 4, 2026 07:23

Replace the global loader-cache flush in deregisterVFS with a
scope-purge that only drops entries owned by the unmounting VFS.
Per-layer ownership is determined two ways:

- For CJS-style filename-keyed caches (Module._cache,
  Module._pathCache, the CJS stat cache, the helpers.js realpath
  cache, and the package.json caches) entries are filtered with
  `vfs.shouldHandle(filename)`. __filename stays a clean absolute
  path so user code that does `path.dirname(__filename)` or similar
  is unaffected.

- For the ESM cascaded loader's loadCache, entries are tagged at
  resolve time: when finalizeResolution() detects the resolved
  path is VFS-owned (via the new loaderGetLayerForPath hook), it
  appends `?vfs-layer=<id>` to the URL. The tag surfaces in
  `import.meta.url`, matching the cache-busting pattern used by
  HMR tooling. On deregister, entries whose URL carries the tag
  for the unmounting layer are deleted.

Multi-mount setups no longer pay the cross-VFS cache-warmup
penalty when a single VFS unmounts, and ESM modules loaded from a
VFS become reachable for purge instead of leaking forever in the
cascaded loader.

New helpers exposed for VFS:
- cjs/loader.js: clearStatCacheForVFS
- helpers.js: purgeRealpathCacheForVFS, loaderGetLayerForPath
- package_json_reader.js: purgePackageJSONCacheForVFS

Adds test-vfs-scoped-cache-purge covering both the multi-mount
isolation and the import.meta.url tag visibility.

Refs: nodejs#63653
Signed-off-by: Matteo Collina <hello@matteocollina.com>

5 participants